Abstract

Conditional Equi-concentration of Types on I-projections is presented.
It provides an extension of the Conditional Weak Law of Large Numbers to
the case of several I-projections. A multiple I-projections extension
of the Gibbs Conditioning Principle is also developed, and $\mu$-projection
variants of the probabilistic laws are stated. Implications of the results
for Relative Entropy Maximization, Maximum Probability, Maximum Entropy in
the Mean and Maximum Renyi-Tsallis Entropy methods are discussed.
1 Terminology and Notation

Let $\{X_l\}_{l=1}^{n}$ be a sequence of independently and identically
distributed random variables with a common law (measure) on a measurable
space. Let the measure be concentrated on a finite number m of atoms from the
set $X = \{x_1, x_2, \ldots, x_m\}$, called support or alphabet. Let $q_i$
denote the probability (measure) of the i-th element of X; q will be assumed
strictly positive and called source or generator.
Let P(X) be the set of all probability mass functions (pmf's) on X.

A type (also called n-type, empirical measure, frequency distribution or
occurrence vector) induced by a sequence $\{X_l\}_{l=1}^{n}$ is the pmf
$\nu^n \in P(X)$ whose i-th element is defined as $\nu_i^n = n_i/n$, where
$n_i = \sum_{l=1}^{n} I(X_l = x_i)$ and $I(\cdot)$ is the indicator function.
The multiplicity $\Gamma(\nu^n)$ of a type $\nu^n$ is
$\Gamma(\nu^n) = n!/\prod_{i=1}^{m} n_i!$.
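The definitions of type and multiplicity can be illustrated with a short Python sketch. The function names `empirical_type` and `multiplicity` are hypothetical and chosen only for this illustration; the alphabet is assumed to be given as an ordered collection of symbols.

```python
from collections import Counter
from math import factorial


def empirical_type(sequence, alphabet):
    """Return the occurrence vector (n_1..n_m) and the type nu^n = n_i / n."""
    n = len(sequence)
    tally = Counter(sequence)
    counts = [tally[x] for x in alphabet]
    return counts, [c / n for c in counts]


def multiplicity(counts):
    """Number of sequences sharing one occurrence vector: n! / prod(n_i!)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)  # each partial quotient is an exact integer
    return result


counts, nu = empirical_type("aabac", alphabet="abc")
print(counts, nu, multiplicity(counts))
```

Integer arithmetic is used for the multiplicity so that no rounding enters for moderate n.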
Let $\Pi \subseteq P(X)$. Let $P_n$ denote the subset of P(X) which consists
of all n-types.

Let $\Pi_n = \Pi \cap P_n$.

On P(X) the topology induced by the standard topology on $R^m$ is assumed.

The $\mu$-projection $\hat\nu^n$ of q on $\Pi_n \neq \emptyset$ is defined as
$\hat\nu^n = \arg\sup_{\nu^n \in \Pi_n} \pi(\nu^n; q)$,
where $\pi(\nu^n; q) = \Gamma(\nu^n) \prod_{i=1}^{m} q_i^{n\nu_i^n}$.
Alternatively, the $\mu$-projection can be defined as
$\hat\nu^n = \arg\sup_{\nu^n \in \Pi_n} \pi(\nu^n \mid \nu^n \in \Pi_n; q)$,
where $\pi(\nu^n \mid \nu^n \in \Pi_n; q)$ denotes the conditional probability
that if an n-type belongs to $\Pi_n$ then it is just the type $\nu^n$.
The $\mu$-projection can also be equivalently defined as the supremum of
posterior probability, cf. [13].
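For a finite list of feasible occurrence vectors, the $\mu$-projection can be found by direct enumeration. The sketch below is only an illustration under that assumption; `type_probability` and `mu_projections` are hypothetical names, and ties are detected with a relative tolerance that is itself an arbitrary choice.

```python
from math import factorial, prod


def type_probability(counts, q):
    """pi(nu^n; q) = multiplicity * prod(q_i ** n_i), where n_i = n * nu_i."""
    mult = factorial(sum(counts))
    for c in counts:
        mult //= factorial(c)
    return mult * prod(qi ** c for qi, c in zip(q, counts))


def mu_projections(feasible_counts, q, rel_tol=1e-12):
    """Occurrence vectors in the feasible list with maximal probability."""
    probs = [type_probability(c, q) for c in feasible_counts]
    best = max(probs)
    return [c for c, p in zip(feasible_counts, probs)
            if p >= best * (1 - rel_tol)]


q = [0.5, 0.3, 0.2]
feasible = [(2, 1, 1), (1, 2, 1), (4, 0, 0)]
print(mu_projections(feasible, q))
```

Dividing every probability by the total probability of the feasible list gives the conditional form of the definition and leaves the maximizer unchanged.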

The I-projection $\hat p$ of q on $\Pi$ is
$\hat p = \arg\inf_{p \in \Pi} I(p\|q)$, where
$I(p\|q) = \sum_{X} p_i \log\frac{p_i}{q_i}$
is the I-divergence (also known as Kullback-Leibler distance, ± relative
entropy).

$\pi(\nu^n \in A \mid \nu^n \in B; q)$ will denote the conditional probability
that if a type drawn from $q \in P(X)$ belongs to $B \subseteq \Pi$ then it
belongs to $A \subseteq \Pi$.

2 Boltzmann Jaynes Inverse Problem and Con- 
ditional Law of Large Numbers 

Having the terminology introduced, the Boltzmann Jaynes Inverse Problem (BJIP)
can be stated as follows: there is the source q and a set $\Pi_n$ of n-types.
It is necessary to select an n-type (one or more) from the set $\Pi_n$. To
solve BJIP it is necessary to provide an algorithm for selection of a type
from $\Pi_n$ when the information-quadruple $\{X, n, q, \Pi_n\}$ and nothing
else is supplied. Clearly, if $\Pi_n$ contains more than one type, BJIP
becomes an under-determined and in this sense ill-posed problem.

Usually, BJIP is solved by means of the method of Relative Entropy
Maximization (REM/MaxEnt). This is mostly done for $n \to \infty$. In this
case the set of types $\Pi_n$ effectively turns into a set of probability
mass functions $\Pi$.

Typically, $\Pi$ is defined by moment consistency constraints (mcc) of the
following form¹:
$\Pi_{mcc} = \{p : \sum_{i=1}^{m} p_i u_i = a, \sum_{i=1}^{m} p_i = 1\}$,
where $a \in R$ is a given number and u is a given vector. The feasible set
$\Pi_{mcc}$ which the mcc define is convex and closed. The I-projection
$\hat p$ of q on $\Pi_{mcc}$ is unique and belongs to the exponential family
of distributions: $\hat p_i = k(\lambda) q_i e^{-\lambda u_i}$, where
$k(\lambda) = 1/\sum_{i=1}^{m} q_i e^{-\lambda u_i}$ and $\lambda$
is such that $\hat p$ satisfies the mcc.
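A minimal numerical sketch of the exponential-family form follows. It finds $\lambda$ by bisection, which is one simple choice among many and not a method prescribed by the text; the names `tilted`, `i_projection_mcc` and `i_divergence`, the bracket [-50, 50] and the iteration count are all assumptions of this illustration, and a is assumed to lie strictly between the smallest and largest component of u.

```python
from math import exp, log


def tilted(q, u, lam):
    """p_i = k(lam) * q_i * exp(-lam * u_i); k(lam) normalises the weights."""
    w = [qi * exp(-lam * ui) for qi, ui in zip(q, u)]
    total = sum(w)
    return [wi / total for wi in w]


def i_projection_mcc(q, u, a, lo=-50.0, hi=50.0, iters=200):
    """Bisection on lam; the mean of u under tilted(q, u, lam) decreases in lam."""
    def mean(lam):
        return sum(pi * ui for pi, ui in zip(tilted(q, u, lam), u))

    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(mid) > a:
            lo = mid  # mean still too large: lam must grow
        else:
            hi = mid
    return tilted(q, u, (lo + hi) / 2)


def i_divergence(p, q):
    """I(p||q) = sum p_i log(p_i / q_i), with 0 log 0 taken as 0."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


q = [1 / 3, 1 / 3, 1 / 3]
p_hat = i_projection_mcc(q, u=[1, 2, 3], a=2.5)
print(p_hat, i_divergence(p_hat, q))
```

The monotonicity used by the bisection holds because the derivative of the tilted mean with respect to $\lambda$ is minus the variance of u under the tilted pmf.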

In the case of BJIP with $\Pi_{mcc}$, or in general for any closed, convex, rare set
$\Pi$, application of the REM/MaxEnt method is justified by the Conditional Weak Law
of Large Numbers (CWLLN). CWLLN, in its textbook form, reads [J: 

CWLLN. Let X be a finite set. Let $\Pi$ be a closed, convex set which does not
contain q. Let $n \to \infty$. Then for $\epsilon > 0$ and $i = 1, 2, \ldots, m$,

$\lim_{n \to \infty} \pi(|\nu_i^n - \hat p_i| < \epsilon \mid \nu^n \in \Pi; q) = 1$.

CWLLN says that if types are confined to the set $\Pi$ then they asymptotically
conditionally concentrate on the I-projection $\hat p$ of the source of types q
on the set $\Pi$. Stated informally, from another side: if a source q is
confined to produce types from a convex and closed $\Pi$, it is asymptotically
conditionally 'almost impossible' to find a type other than the one which has
the highest/supremal value of relative entropy with respect to q.

¹ In the simplest case of a single non-trivial constraint.

Conditional Weak Law of Large Numbers emerged from a series of works 

which include p, no], [in, eh, [si], mi, m, 0. i. i, EH], eg. 

For new developments see [23] . 

An information-theoretic proof (see [1]) of CWLLN utilizes so-called Py- 
thagorean theorem (cf. [5]), Pinsker inequality and standard inequalities for 
factorial. The Pythagorean theorem is known to hold for closed convex sets. 
Alternatively, CWLLN can be obtained as a consequence of Sanov's Theorem 
(ST). The ST-based proof of CWLLN will be recalled here. First, Sanov's 
Theorem and its proof (adapted from [7], [4]). 

Sanov's Theorem. Let X be finite. Let $A \subseteq \Pi$ be an open set. Then

$\lim_{n \to \infty} \frac{1}{n} \log \pi(\nu^n \in A) = -I(\hat p\|q)$,

where $\hat p$ is an I-projection of q on A.

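Proof. [4], [7] $\pi(\nu^n \in A) = \sum_{\nu^n \in A} \pi(\nu^n; q)$. Upper
and lower bounds on $\pi(\nu^n; q)$ (recall the proof of the Lemma in the
Appendix):

$\left(\frac{m}{n}\right)^m \prod_{i=1}^{m} \left(\frac{q_i}{\nu_i^n}\right)^{n\nu_i^n} < \pi(\nu^n; q) < \prod_{i=1}^{m} \left(\frac{q_i}{\nu_i^n}\right)^{n\nu_i^n}$.

$\sum_{\nu^n \in A} \pi(\nu^n; q) < N \prod_{i=1}^{m} \left(\frac{q_i}{\hat\nu_i^n}\right)^{n\hat\nu_i^n}$,
where N stands for the number of all n-types and $\hat\nu^n$ is an
I-projection of q on $A_n = A \cap P_n$ (i.e., any of the n-types which
attain the supremal value of
$\prod_{i=1}^{m} (q_i/\nu_i^n)^{n\nu_i^n}$). N is smaller than $(n + 1)^m$.

Thus

$\frac{1}{n}\left(n \sum_{i=1}^{m} \hat\nu_i^n \log\frac{q_i}{\hat\nu_i^n} + m(\log m - \log n)\right) < \frac{1}{n} \log \pi(\nu^n \in A) < \frac{1}{n}\left(n \sum_{i=1}^{m} \hat\nu_i^n \log\frac{q_i}{\hat\nu_i^n} + m \log(n + 1)\right)$.

Since A is by assumption open and, under the maintained assumption of
strictly positive q, the I-divergence is also continuous,
$\lim_{n \to \infty} \sum_{i=1}^{m} \hat\nu_i^n \log\frac{q_i}{\hat\nu_i^n} = \sum_{i=1}^{m} \hat p_i \log\frac{q_i}{\hat p_i}$,
where $\hat p$ is an I-projection of q on A. Thus, for $n \to \infty$ the
upper and lower bounds on $\frac{1}{n} \log \pi(\nu^n \in A)$ collapse into
$\sum_{i=1}^{m} \hat p_i \log\frac{q_i}{\hat p_i}$. □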
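The two-sided bound on the probability of a type can be checked numerically in log form. The sketch below is an illustration only: `log_type_probability` and `log_bounds` are hypothetical names, and the example occurrence vectors are chosen with every $n_i > 6$ so that the factorial inequality quoted in the Appendix applies to each component.

```python
from math import lgamma, log


def log_type_probability(counts, q):
    """log pi(nu^n; q), using lgamma(k + 1) = log(k!)."""
    n = sum(counts)
    log_mult = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    return log_mult + sum(c * log(qi) for c, qi in zip(counts, q) if c > 0)


def log_bounds(counts, q):
    """Logs of (m/n)^m * prod((q_i/nu_i)^(n nu_i)) and of prod((q_i/nu_i)^(n nu_i))."""
    n, m = sum(counts), len(counts)
    core = sum(c * log(qi * n / c) for c, qi in zip(counts, q) if c > 0)
    return core + m * (log(m) - log(n)), core


q = [1 / 3, 1 / 3, 1 / 3]
for counts in [(12, 10, 8), (120, 100, 80)]:
    lower, upper = log_bounds(counts, q)
    n = sum(counts)
    print(n, lower / n, log_type_probability(counts, q) / n, upper / n)
```

After division by n the gap between the bounds is $m(\log n - \log m)/n$, which vanishes as n grows; this is the step that makes the bounds collapse in the proof.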

A proof of CWLLN. [7] Let $A = \{p : |p_i - \hat p_i| > \epsilon, i = 1, 2, \ldots, m\}$.
Then ST can be applied to it, leading to
$\lim_{n \to \infty} \frac{1}{n} \log \pi(\nu^n \in A \mid \nu^n \in \Pi; q) = -(I(\hat p_A\|q) - I(\hat p_\Pi\|q))$.
Since $I(\hat p_A\|q) - I(\hat p_\Pi\|q) > 0$ and since the set $\Pi$ admits a
unique I-projection (the uniqueness arises from the fact that the set is
convex and closed, and the I-divergence is convex), the proof is complete. □

CWLLN can be viewed as a special case of a stronger result, which is com- 
monly known as Gibbs Conditioning Principle (GCP), see Sect. 5. 
3 Motivation and Programme



Frequency moment constraints considered by physicists (see for instance [33])
define a non-convex feasible set of probability distributions which in general
can admit multiple I-projections. This work builds upon [12], [27], [13] , [E], [H]
and aims to develop an extension of CWLLN and the Gibbs Conditioning Principle
to the case of multiple I-projections.

It also has another goal: to introduce the concept of $\mu$-projection and to
formulate $\mu$-projection variants of the probabilistic laws. They, among
other things, allow for a more elementary reading than their I-projection
counterparts. At the same time they provide a probabilistic justification of
the Maximum Probability method [TT] .

The paper is organized as follows: in the next section some basic questions
regarding asymptotic behavior of conditional probability are posed. Two
illustrative examples are then used to introduce Conditional Equi-concentration
of Types on I-projections (ICET). Next, an extension of the Gibbs Conditioning
Principle - the stronger form of CWLLN - is provided. Asymptotic identity of
I-projections and $\mu$-projections is discussed in Section 6 and
$\mu$-variants of the probabilistic laws are presented afterwards.
Implications of the results for Maximum Entropy, Maximum Probability and
Maximum Renyi-Tsallis Entropy methods are drawn in Section 8. Section 9
mentions in passing other related results: an r-tuple extension of CWLLN and
the Bayesian Conditional Law of Large Numbers. Section 10 summarizes the
paper. The Appendix contains a sketch of the proof of ICET and of Extended
GCP. It also shows that concentration of types can in some sense happen also
on isolated I-projections, provided that they are rational.

4 Conditional Equi-concentration of Types 

What happens when $\Pi$ admits multiple I-projections? Does the conditional
concentration of types happen on them? If yes, do the types concentrate on
each of them? If yes, what is the proportion? In order to address these
questions, it is instructive to consider a couple of examples.

Example 1. [13] Let
$\Pi = \{p : \sum_{i=1}^{m} p_i^\alpha - a = 0, \sum_{i=1}^{m} p_i - 1 = 0\}$,
where $a, \alpha \in R$. Note that the first constraint, known as the
frequency constraint, is non-linear in p and $\Pi$ is non-convex for
$|\alpha| > 1$.

Let $\alpha = 2$, m = 3 and a = 0.42 (the value was obtained for
p = [0.5 0.4 0.1]). Then there are the following three I-projections of the
uniform distribution q = [1/3 1/3 1/3] on $\Pi$:
$\hat p^1$ = [0.5737 0.2131 0.2131], $\hat p^2$ = [0.2131 0.5737 0.2131]
and $\hat p^3$ = [0.2131 0.2131 0.5737] (see [H]). Note that they form a group
of permutations. As it will become clear later, it suffices to investigate
convergence to, say, $\hat p^1$.

For n = 30 there are only two groups of types in $\Pi$: G1 comprises
[0.5666 0.2666 0.1666] and five other permutations; G2 consists of
[0.5 0.4 0.1] and the other five permutations. So, together there are 12 types.

The value of the square of the Euclidean distance $\delta$ between $\nu^n$
and $\hat p^1$ attains its minimum $\delta_{G1}$ = 0.0051 within the G1 group
for the following two types: [0.5666 0.2666 0.1666], [0.5666 0.1666 0.2666].
Within G2 the smallest $\delta_{G2}$ = 0.0532 is attained by [0.5 0.4 0.1]
and [0.5 0.1 0.4].

Thus, if $\epsilon = \epsilon_1$ is chosen so that the $\epsilon$-ball
$B(\hat p^1, \epsilon_1)$ centered at $\hat p^1$ contains only the two types
from G1 (which at the same time guarantees that $\hat p^1$ is the only
I-projection in the ball), then
$\pi(\nu^n \in B(\hat p^1, \epsilon_1) \mid \nu^n \in \Pi)$ = 2 · 0.1152 = 0.2304.
In words: the probability that, if q generated a type from $\Pi$, then the
type falls into the ball containing only the types which are closest to the
I-projection is 0.2304. If $\epsilon = \epsilon_2$ is chosen so that also the
two types from G2 are included in the ball and also so that $\hat p^1$ is the
only I-projection in the ball (any
$\epsilon_2 \in (\sqrt{0.0532}, \sqrt{0.1253})$ satisfies both requirements),
then $\pi(\nu^n \in B(\hat p^1, \epsilon_2) \mid \nu^n \in \Pi)$ = 1/3.
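The n = 30 case is small enough to enumerate exactly. The following sketch is an illustration, not the computation used in [13]: it lists the occurrence vectors with $n_1^2 + n_2^2 + n_3^2 = 0.42 \cdot 30^2 = 378$, weights them by multiplicity (which is proportional to $\pi(\nu^n; q)$ for uniform q), and evaluates the conditional probability of the four types nearest to the first I-projection. All function and variable names are hypothetical.

```python
from math import factorial


def types_on_constraint(n, target_sq_sum):
    """Occurrence vectors (n1, n2, n3) with sum n and sum of squares = target."""
    found = []
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            n3 = n - n1 - n2
            if n1 * n1 + n2 * n2 + n3 * n3 == target_sq_sum:
                found.append((n1, n2, n3))
    return found


def multiplicity(counts):
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result


n = 30
feasible = types_on_constraint(n, 378)
weights = {c: multiplicity(c) for c in feasible}  # uniform q: pi ~ multiplicity
total = sum(weights.values())


def conditional_probability(selected):
    """pi(type in selected | type in Pi_n) under the uniform source."""
    return sum(weights[c] for c in selected) / total


p_hat_1 = (0.5737, 0.2131, 0.2131)


def sq_dist(counts):
    return sum((c / n - p) ** 2 for c, p in zip(counts, p_hat_1))


nearest_four = sorted(feasible, key=sq_dist)[:4]
print(len(feasible), nearest_four, conditional_probability(nearest_four))
```

By the permutation symmetry of the feasible set, the four nearest types carry exactly one third of the conditional mass, in agreement with the value 1/3 stated for the larger ball.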

For n = 330 there are four groups of types in $\Pi$: G1, G2 and a couple of
new ones: G3 consists of [0.4727 0.4333 0.0939] and all its permutations; G4
comprises the type [0.5727 0.2333 0.1939] and its permutations. Hence, the
total number of types from $\Pi$ which are supported by random sequences of
size n = 330 is 24.

$\delta_{G3}$ for the two types from G3 which are closest to $\hat p^1$ is
0.0729. The smallest $\delta_{G4}$ = 0.00077 is attained by
[0.5727 0.2333 0.1939] and by [0.5727 0.1939 0.2333]. Thus, clearly, the two
types from G4 have the smallest Euclidean distance to $\hat p^1$ among all
types from $\Pi$ which are based on samples of size n = 330. Again, setting
$\epsilon$ such that the ball $B(\hat p^1, \epsilon)$ contains only the two
types which are closest to $\hat p^1$ leads to the value 0.261 of the
conditional probability. Note the important fact that the probability has
risen, as compared to the corresponding value 0.2304 for n = 30.

Moreover, if $\epsilon$ is set such that besides the two types from G4 also
the second closest types (i.e. the two types from G1) are included in the ball
then the conditional probability is indistinguishable from 1/3. Hence, there
is virtually no conditional chance of observing any of the remaining 4 types.
The same holds for the types which concentrate around $\hat p^2$ or
$\hat p^3$. Thus, in total, a half of the 24 types is almost impossible to
observe.

The Example illustrates that the conditional probability of finding a type
which is close (in the Euclidean distance) to one of the three I-projections
goes to 1/3.

Example 2. [13] Let $\Pi = \Pi_1 \cup \Pi_2$, where
$\Pi_j = \{p : \sum_{i=1}^{m} p_i x_i = a_j, \sum_{i=1}^{m} p_i = 1\}$,
j = 1, 2. Thus $\Pi$ is a union of two sets, each of which is given by the
moment consistency constraint. If q is chosen to be the uniform distribution,
then values $a_1$, $a_2$ such that there will be two different I-projections
of the uniform q on $\Pi$ with the same value of I-divergence (as well as of
the Shannon's entropy) can be easily found. Indeed, for any
$a_1 = \mu + \Delta$, $a_2 = \mu - \Delta$, where $\mu = EX$ and
$\Delta \in (0, (X_{max} - X_{min})/2)$, $\hat p^1$ is just a permutation of
$\hat p^2$, and as such attains the same value of Shannon's entropy. To see
that types which are based on random samples of size n from $\Pi$ indeed
concentrate on the I-projections with equal measure, note that for any n to
each type in $\Pi_1$ corresponds a unique permutation of the type in $\Pi_2$.
Thus, types in the $\epsilon$-ball with center at $\hat p^1$ have the same
conditional probabilities $\pi$ as types in the $\epsilon$-ball centered at
$\hat p^2$. This, together with convexity and closedness of both $\Pi_j$, for
which the conditional concentration of types on the respective I-projection
is established by CWLLN, directly implies that

$\lim_{n \to \infty} \pi(\nu^n \in B(\hat p^j, \epsilon) \mid \nu^n \in \Pi) = \frac{1}{2}$, j = 1, 2.
Conditional Equi-concentration of Types on I-projections (ICET) attempts
to capture the behavior of the conditional measure which the above Examples
illustrate. To this end, the notion of a proper I-projection will be needed.

An I-projection $\hat p$ of q on $\Pi$ will be called proper if $\hat p$ is
not an isolated point of $\Pi$.

ICET. Let X be finite. Let there be k proper I-projections
$\hat p^1, \hat p^2, \ldots, \hat p^k$ of q on $\Pi$. Let $\epsilon > 0$ be
such that for $j = 1, 2, \ldots, k$, $\hat p^j$ is the only proper
I-projection of q on $\Pi$ in the ball $B(\hat p^j, \epsilon)$. Let
$n \to \infty$. Then for $j = 1, 2, \ldots, k$,

$\pi(\nu^n \in B(\epsilon, \hat p^j) \mid \nu^n \in \Pi; q) = 1/k$.

ICET says, informally, that the source/generator q, when confined to produce
types from a set $\Pi$, hides itself, as n gets large, behind any of the
proper I-projections equally likely.

Expressed in Statistical Physics terminology, ICET says that each of the
equilibrium points (I-projections) is asymptotically conditionally equally
possible. The Conditional Equi-concentration of Types 'phenomenon' resembles
the triple point phenomenon of Thermodynamics.

A sketch of the proof of ICET is relegated to the Appendix.

5 Gibbs Conditioning Principle and its Exten- 
sion 

Gibbs conditioning principle (cf. [5], [5], [M]) - also known as the stronger form
of CWLLN - complements CWLLN by stating that:

GCP. Let X be a finite set. Let $\Pi$ be a closed, convex set. Let
$n \to \infty$. Then for a fixed t,

$\lim_{n \to \infty} \pi(X_1 = x_1, \ldots, X_t = x_t \mid \nu^n \in \Pi; q) = \prod_{l=1}^{t} \hat p_{x_l}$.

GCP says, very informally, that if the source q is confined to produce
sequences which lead to types in a set $\Pi$ then the elements of any such
sequence (of fixed length t) behave asymptotically conditionally as if they
were drawn identically and independently from the I-projection of q on $\Pi$,
provided that the last is unique (among other things).

GCP was developed at [5] under the name of conditional quasi-independence 
of outcomes. Later on, it was brought into more abstract form in large deviations 
literature, where it also obtained the GCP name (cf. [5], [HI]). A simple proof 
of GCP can be found at [7]. GCP is proven also for continuous alphabet (cf. 

EH, 0, !)• 

The following theorem provides an extension of GCP to the case of multiple
I-projections.

EGCP. Let there be k proper I-projections
$\hat p^1, \hat p^2, \ldots, \hat p^k$ of q on $\Pi$. Then for a fixed t and
$n \to \infty$,

$\pi(X_1 = x_1, \ldots, X_t = x_t \mid \nu^n \in \Pi; q) = \frac{1}{k} \sum_{j=1}^{k} \prod_{l=1}^{t} \hat p^j_{x_l}$.
For t = 1 the Extended Gibbs Conditioning Principle (EGCP) says that the
conditional probability of a letter is asymptotically given by the equal-weight
mixture of proper I-projection probabilities of the letter. For a general
sequence, EGCP states that the conditional probability of a sequence is
asymptotically equal to the mixture of joint probability distributions. Any
(j-th) of the k joint distributions is such as if the sequence was iid
distributed according to a (j-th) proper I-projection.
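The mixture on the right-hand side of EGCP can be evaluated directly for the three I-projections quoted in Example 1. This is only a numerical illustration of the limiting expression; `egcp_limit` is a hypothetical name and letters are addressed by their index in the alphabet.

```python
from math import prod


def egcp_limit(letter_indices, projections):
    """Equal-weight mixture, over proper I-projections, of iid product probabilities."""
    k = len(projections)
    return sum(prod(p[i] for i in letter_indices) for p in projections) / k


projections = [
    (0.5737, 0.2131, 0.2131),
    (0.2131, 0.5737, 0.2131),
    (0.2131, 0.2131, 0.5737),
]

single_letter = [egcp_limit([i], projections) for i in range(3)]
pair_same = egcp_limit([0, 0], projections)
pair_diff = egcp_limit([0, 1], projections)
print(single_letter, pair_same, pair_diff)
```

For t = 1 the mixture returns (up to the rounding of the quoted projections) the uniform source, while for t = 2 the probability of a repeated letter exceeds the product of the single-letter probabilities: under the mixture the letters are exchangeable but not independent.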

A proof of EGCP is sketched at the Appendix. 

6 Asymptotic Identity of $\mu$-Projections and I-Projections

At ([TT], Thm 1 and its Corollary, aka MaxProb/MaxEnt Thm) it was shown
that the maximum probability type converges to the I-projection, provided
that $\Pi$ is defined by differentiable constraints. A more general result
which states asymptotic identity of $\mu$-projections and I-projections for a
general set $\Pi$ was presented at PS].

MaxProb/MaxEnt. Let X be a finite set. Let $M_n$ be the set of all
$\mu$-projections of q on $\Pi_n$. Let $\mathcal{I}$ be the set of all
I-projections of q on $\Pi$. For $n \to \infty$, $M_n = \mathcal{I}$.

Since $\pi(\nu^n; q)$ is defined for $\nu^n \in Q^m$, a $\mu$-projection can
be defined only for $\Pi_n$ when n is finite. The Thm permits to define a
$\mu$-projection $\hat\nu$ also on $\Pi$:
$\hat\nu = \arg\sup_{r \in \Pi} -\sum_{i=1}^{m} r_i \log\frac{r_i}{q_i}$.
The $\mu$-projections of q on $\Pi$ and the I-projections of q on the same
set $\Pi$ are indistinguishable.

It is worth highlighting that for a finite n, $\mu$-projections and
I-projections of q on $\Pi_n$ are in general different. This explains why the
$\mu$-form of the probabilistic laws deserves to be stated separately from the
I-form, though formally they are indistinguishable. Thus, the MaxProb/MaxEnt
Thm (in its new and to a smaller extent also in its old version) permits to
state directly the $\mu$-projection variants of CWLLN, GCP, ICET and EGCP:
$\mu$CWLLN, $\mu$GCP, $\mu$CET and the Boltzmann Conditioning Principle (BCP).

7 $\mu$-Variants of the Probabilistic Laws

The $\mu$-variant of CWLLN reads:

$\mu$CWLLN. Let X be a finite set. Let $\Pi$ be a closed, convex set. Let
$n \to \infty$. Then for $\epsilon > 0$ and $i = 1, 2, \ldots, m$,

$\lim_{n \to \infty} \pi(|\nu_i^n - \hat\nu_i| < \epsilon \mid \nu^n \in \Pi; q) = 1$.

The core of $\mu$CWLLN can be loosely expressed as: types, when confined to a
set $\Pi$, conditionally concentrate on the asymptotically most probable type
$\hat\nu$.

A $\mu$-projection $\hat\nu$ of q on $\Pi$ will be called proper if $\hat\nu$
is not an isolated point of $\Pi$.

$\mu$CET. Let X be finite. Let there be k proper $\mu$-projections
$\hat\nu^1, \hat\nu^2, \ldots, \hat\nu^k$ of q on $\Pi$. Let $\epsilon > 0$ be
such that for $j = 1, 2, \ldots, k$, $\hat\nu^j$ is the only proper
$\mu$-projection of q on $\Pi$ in the ball $B(\hat\nu^j, \epsilon)$. Let
$n \to \infty$. Then for $j = 1, 2, \ldots, k$,

$\pi(\nu^n \in B(\epsilon, \hat\nu^j) \mid \nu^n \in \Pi; q) = 1/k$.
The core of the $\mu$-variant of the Conditional Equi-concentration of Types
states, loosely, that types conditionally concentrate on each of the
asymptotically most probable types in equal measure.

$\mu$GCP. Let X be a finite set. Let $\Pi$ be a closed, convex set. Let
$n \to \infty$. Then for a fixed t,

$\lim_{n \to \infty} \pi(X_1 = x_1, \ldots, X_t = x_t \mid \nu^n \in \Pi; q) = \prod_{l=1}^{t} \hat\nu_{x_l}$.

The $\mu$-variant of EGCP deserves a special name. It will be called the
Boltzmann Conditioning Principle (BCP).

BCP. Let there be k proper $\mu$-projections
$\hat\nu^1, \hat\nu^2, \ldots, \hat\nu^k$ of q on $\Pi$. Then for a fixed t
and $n \to \infty$,

$\pi(X_1 = x_1, \ldots, X_t = x_t \mid \nu^n \in \Pi; q) = \frac{1}{k} \sum_{j=1}^{k} \prod_{l=1}^{t} \hat\nu^j_{x_l}$.

8 Implications 

The results have some implications for application of the REM, MaxProb and
Maximum Renyi-Tsallis Entropy methods to the Boltzmann Jaynes Inverse Problem.

8.1 I- or $\mu$-Projection? MaxEnt or MaxProb?

The Maximum Probability method (MaxProb, [TT]) is associated with the
$\mu$-projection. Given the BJIP information-quadruple $\{X, n, q, \Pi_n\}$,
MaxProb prescribes to select from $\Pi_n$ the type(s) which have the
supremal/maximal probability $\pi(\nu^n; q)$².

$\mu$-projections and I-projections are asymptotically indistinguishable. In
plain words: for $n \to \infty$ the Relative Entropy Maximization method
(REM/MaxEnt) (either in its Jaynes' [53], [5S] or Csiszar's interpretation jf)j)
selects the same distribution(s) as MaxProb (in its more general form which
instead of the maximum probable types selects supremum-probable
$\mu$-projections). This result (in the older form, [TT]) was at [TT]
interpreted as saying that REM/MaxEnt can be viewed as an asymptotic instance
of the simple and self-evident Maximum Probability method.

Alternatively, [3S] suggests to view REM/MaxEnt as a separate method
and hence to read the MaxProb/MaxEnt Thm as claiming that REM/MaxEnt
asymptotically coincides with MaxProb. If one adopts this interesting and
legitimate view then it is necessary to face the fact that if n is finite, the
two methods in general differ. This would open new questions. Among them also:
MaxEnt/REM or MaxProb? (i.e., I- or $\mu$-projection?) This is too delicate a
question to be answered in one sentence. Let us note only that unless
$n \to \infty$, entropy ignores multiplicity.

² A technique for determination of $\mu$-projections was suggested at [17].
8.2 I/$\mu$- or $\tau$-projection? MaxEnt/MaxProb or maxTent?

The previous question (i.e., MaxEnt or MaxProb?) is a problem of drawing
interpretational consequences from two variants of the same probabilistic
laws, and in this sense it can be viewed as an 'internal problem' of MaxEnt and
MaxProb. From outside, from the point of view of the Maximum Renyi-Tsallis
Entropy method (maxTent, [37], [3]) MaxProb and MaxEnt can be viewed
as 'twins'.

maxTent is to the best of our knowledge intended by its proponents for
selection of probability distribution(s) under the setting of BJIP with $\Pi$
defined by X-frequency moment constraints (cf. [15]). It is not known whether
such a feasible set $\Pi$ admits a unique distribution with maximal value of
Renyi-Tsallis entropy (called $\tau$-projection at [15]), as it is also not
known whether the I-projection on such a set is unique or not. The
non-uniqueness makes it difficult to rely upon CWLLN when one wants to draw,
from an established non-identity of $\tau$- and I-projection, the conclusion
that the maxTent method violates CWLLN, cf. [27]. At [15] this difficulty has
been avoided by considering an instance of the X-frequency constraints where
the feasible set reduces to a convex set. Since the $\tau$- and I-projection
on the set were shown to be different, CWLLN directly implies that maxTent in
this case selects an asymptotically conditionally improbable distribution.
The Example below (taken from [15]) illustrates the point.

Example 3. [H] Let
$\Pi = \{p : \sum_{i=1}^{3} p_i^2 (x_i - b) = 0, \sum_{i=1}^{3} p_i - 1 = 0\}$.
Let X = [-2 0 1] and let b = 0. Then
$\Pi = \{p : p_3^2 = 2p_1^2, \sum_{i=1}^{3} p_i - 1 = 0\}$, which effectively
reduces to $\Pi = \{p : p_2 = 1 - p_1(1 + \sqrt{2}), p_3 = \sqrt{2} p_1\}$.
The source q is assumed to be uniform u.

The feasible set $\Pi$ is convex. Thus the I-projection $\hat p$ of u on $\Pi$
is unique, and can be found by direct analytic maximization to be
$\hat p$ = [0.2748 0.3366 0.3886]. Straightforward maximization of
Renyi-Tsallis' entropy leads to the maxTent pmf
$\hat p_\tau$ = [0.2735 0.3398 0.3867], which is different from $\hat p$.
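The I-projection in Example 3 can be reproduced numerically. With a uniform source, minimizing the I-divergence over the segment amounts to maximizing Shannon's entropy in the single free parameter $p_1$; the sketch below locates the zero of the entropy derivative by bisection. The function names, the bracket and the iteration count are assumptions of this illustration, not part of the original example.

```python
from math import log, sqrt

SQRT2 = sqrt(2.0)


def pmf(p1):
    """Point of the feasible segment: p2 = 1 - p1 (1 + sqrt 2), p3 = sqrt 2 * p1."""
    return (p1, 1.0 - p1 * (1.0 + SQRT2), SQRT2 * p1)


def entropy_slope(p1):
    """Derivative of Shannon's entropy with respect to p1 along the segment."""
    a, b, c = pmf(p1)
    return -log(a) + (1.0 + SQRT2) * log(b) - SQRT2 * log(c)


def i_projection_example3(iters=100):
    """Bisection on the slope, which decreases in p1 because entropy is concave."""
    lo, hi = 1e-9, 1.0 / (1.0 + SQRT2) - 1e-9  # keeps every component positive
    for _ in range(iters):
        mid = (lo + hi) / 2
        if entropy_slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return pmf((lo + hi) / 2)


print(i_projection_example3())
```

The result agrees with the quoted $\hat p$ to the four decimals given in the text.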

Convexity of the feasible set guarantees uniqueness of the I-projection, and
consequently allows to invoke CWLLN to claim that any pmf from $\Pi$ other than
the I-projection has asymptotically zero conditional probability that it will
be generated by u.

Obviously, ICET permits to show the fatal flaw of maxTent in a more direct
and more general way.
9 Further Results 

Some further results related to asymptotic concentration of conditional proba- 
bility are contained in this Section. 

9.1 r-tuple ICET/CWLLN and MEM/GME Methods

The Maximum Entropy in the Mean method (MEM), or its discrete-case relative,
the Generalized Maximum Entropy (GME) method, are interesting extensions of
the standard REM/MaxEnt method³. Though usually a hierarchical structure
of the methods is highlighted, here a different feature of the method(s) will
appear to be important.

³ For a tutorial on MEM see [22]. GME was introduced at [ID], see also |3H .

First, the Golan-Judge-Miller ill-posed inverse problem (GJMIP) has to be
introduced. Its simple instance can be described as follows: Let there be two
independent sources $q^1$, $q^2$ of sequences and hence types. Let X, Y be the
support of the first and second source, respectively. Let a set $C_n$ comprise
pairs of the types $[\nu^{n,1}, \nu^{n,2}]$ which were drawn at the same time.
GJMIP amounts to selection of specific pair(s) of types from $C_n$ when the
information $\{X, Y, n, q^1 \perp q^2, C_n\}$ is supplied.

Example 4. An example of GJMIP. Let X = Y = [1 2 3]. Let $q^1 = q^2$ =
[1/3 1/3 1/3]; $q^1 \perp q^2$;
$(q^1 \mapsto \nu^{n,1}) \wedge (q^2 \mapsto \nu^{n,2})$. Let n = 100,
$C_n = \{[\nu^{n,1}, \nu^{n,2}] : \sum_{i=1}^{3} \nu_i^{n,1} x_i + \sum_{i=1}^{3} \nu_i^{n,2} y_i = 4; \sum_{i=1}^{3} \nu_i^{n,1} = 1; \sum_{i=1}^{3} \nu_i^{n,2} = 1\}$.
Given this information, it is necessary to select a pair (one or more) of
types from $C_n$.

Since throughout the paper a discrete and finite alphabet is assumed, GME
will be considered instead of MEM in what follows. The important feature
of GME is that it selects jointly and independently drawn pairs (or r-tuples)
of types/pmfs. Thus, it is suitable for application in the GJMIP context. An
r-tuple extension of CWLLN (rCWLLN) provides a probabilistic justification
of GME in the GJMIP context.

Given the GJMIP information, GME selects from the feasible set of pairs of
pmfs the one $[\hat p^1, \hat p^2]$ (or more) which maximizes the sum of the
relative entropies with respect to $q^1$, $q^2$, respectively.

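(r = 2)-tuple CWLLN. Assume a GJMIP. Let C be a convex, closed set. Let
$B([\hat p^1, \hat p^2], \epsilon)$ be an $\epsilon$-ball centered at the pair

$[\hat p^1, \hat p^2] = \arg\sup_{[p^1, p^2] \in C} \left( \sum_{i=1}^{m_1} p_i^1 \log\frac{q_i^1}{p_i^1} + \sum_{i=1}^{m_2} p_i^2 \log\frac{q_i^2}{p_i^2} \right)$.

Let $n \to \infty$. Then

$\pi([\nu^{n,1}, \nu^{n,2}] \in B([\hat p^1, \hat p^2], \epsilon) \mid [\nu^{n,1}, \nu^{n,2}] \in C; (q^1 \mapsto \nu^{n,1}) \wedge (q^2 \mapsto \nu^{n,2}); q^1 \perp q^2) = 1$.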

A proof of rCWLLN can be constructed along the same lines as the proof
of CWLLN; the assumption that the pairs of sequences/types are drawn at the
same time and from independent sources is crucial for establishing the result.
Similarly, an r-generalization of ICET can be formulated and proven; and
obviously $\mu$-variants of the results hold true.

rCWLLN permits to raise the same objections to application of the Renyi
entropy based variant of GME in the GJMIP context as those that were raised to
maxTent in the BJIP context.

Needless to say, the $\mu$-variant of rCWLLN provides a probabilistic
justification of the MaxProb variant of GME.

9.2 Bayesian Conditional Law of Large Numbers 

It is worth mentioning briefly that there is an inverse problem which is in a
sense antipodal to the Boltzmann Jaynes Inverse Problem. Let us call it the
$\beta$-problem, after [25].

One form of the $\beta$-problem can be formulated as follows: let there be a
set Q of sources over which a prior distribution $\pi(\cdot)$ is specified.
Let $\nu^n$ be an n-type drawn from a source r, not necessarily in Q. It is
necessary to select a source $q \in Q$, given the information-pentad
$\{n, X, \nu^n, Q, \pi(\cdot)\}$.

The Conditional Law of Large Numbers for Sources [18] , [20] is concerned with
the asymptotic behavior of the posterior probability
$\pi(q \in B \mid (q \in Q) \wedge \nu^n)$. It states that, under certain
conditions, the posterior probability asymptotically piles up on the
L-projection of r on Q. Hence, the particular $\beta$-problem has to be solved
by L-divergence maximization method.

An application of the Conditional Limit Theorem for Sources to the criterion
choice problem associated with Empirical Estimation [31], [32], as well as
further discussion, can be found at [TO] .

10 Summary 

Conditional Equi-concentration of Types on I-projections - an extension of
CWLLN to the case of non-unique I-projection - was presented. ICET states
that the conditional concentration of types happens on each of the proper
I-projections in equal measure. Also, the Gibbs Conditioning Principle was
enhanced to capture multiple I-projections. Extended GCP says (when t = 1)
that the conditional probability of a letter is asymptotically given by the
equal-weight mixture of proper I-projection probabilities of the letter. The
conditional equi-concentration/equi-probability 'phenomenon' is in our view an
interesting feature of 'randomness'. It might be of some interest also for
Statistical Mechanics as it resembles phase coexistence of Thermodynamics
(e.g. the triple point of water, vapor and ice).

A general form of the MaxProb/MaxEnt Thm, which states asymptotic identity of
I- and $\mu$-projections, was recalled. It permits to formulate
$\mu$-projection variants of the corresponding I-projection laws:
CWLLN/GCP/ICET/EGCP. In our view, the $\mu$-variants allow for a deeper
reading than their I-projection counterparts, since the $\mu$-laws express the
asymptotic conditional behavior of types in terms of the most probable types.
For instance, the $\mu$-projection variant of CWLLN says that types
conditionally concentrate on the asymptotically most probable one. This is, in
our view, a more obvious statement than that made by the I-variant of CWLLN.
The MaxProb/MaxEnt Theorem is also instrumental for establishing ICET.

The main results - Conditional Equi-concentration of Types (CET) in both
its I- and $\mu$-projection form as well as Extended Gibbs Conditioning and
Boltzmann Conditioning - were supplemented also by further considerations.
They are summarized below.

Though $\mu$-projections and I-projections asymptotically coincide, for a
finite n they are, in general, different. In light of this fact, the
asymptotic identity of $\mu$- and I-projections can be viewed in two ways:
either as saying that 1) the I-projection of q on $\Pi$ is the asymptotic form
of the $\mu$-projection of q on $\Pi_n$, or that 2) $\mu$-projections on
$\Pi_n$ and I-projections on $\Pi_n$ asymptotically coincide. Regardless of
the preferred view, the $\mu$-variants of the laws provide a probabilistic
justification of the Maximum Probability method (MaxProb, cf. [H]) (at least
in the area of the Boltzmann-Jaynes inverse problem). If the second view is
adopted, then, when n is finite, it is necessary to face the challenge of
selecting between the REM/MaxEnt method and the MaxProb method.

The results are relevant also for the Maximum Renyi-Tsallis Entropy method
(maxTent), which has been in vogue in Statistical Physics over the last years.
maxTent is to the best of our knowledge proposed as a method for solving BJIP,
albeit with the feasible set $\Pi$ defined by non-linear moment constraints.
Since, in general, maxTent distributions ($\tau$-projections on $\Pi$) are
different from I/$\mu$-projections on $\Pi$, ICET implies that the maxTent
method selects asymptotically conditionally improbable/impossible
distributions.

A straightforward extension of CWLLN/CET for r-tuples of types was also
mentioned. It was noted that the extension provides a justification of the
Generalized Maximum Entropy method in the area of the Golan-Judge-Miller
Inverse Problem.

The Conditional Law of Large Numbers for Sources and its implications for the
$\beta$-problem were also mentioned in passing.



11 Appendix 

11.1 MaxProb/MaxEnt 

MaxProb/MaxEnt. Let X be finite set. Let M n be set of all ^-projections of 
q on LI„. Let I be set of all I -projections of q on II. For n — > oo ; M n = I. 

Proof. [Hj Necessary and sufficient conditions for v n to be a ^-projection of q 
on IT„ are: a) Tr(P n ;q) > n(v n ;q), W" 6 II„; b) whenever v n has the property 
a) then ir(v n ;q) < ir(y n ;q). Requirement a) can be equivalently stated as: 



V 



my>-(nc-«r <•> 



and b) similarly. Standard inequality (n/e) n < n\ < n(n/e) n (valid for n > 6) 
allows to bind the LHS of (1): 

n«r r LHS n m / n u^?rHW) 1/n m 



n m /»nW ? (IM 1/n UiKY 

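and similar bounds can be stated in the case of requirement b).⁴ Since 
$m$ is by assumption finite, as $n \to \infty$ the lower and upper bounds at (2) collapse 
into the ratio $\prod_i (\nu_i^n)^{\nu_i^n} / \prod_i (\hat\nu_i^n)^{\hat\nu_i^n}$. Consequently, the necessary and sufficient 
conditions a), b) for a $\mu$-projection turn, as $n \to \infty$, into (expressed in an equivalent 
log-form): i) $\sum_i (\nu_i^n \log \nu_i^n - \hat\nu_i^n \log \hat\nu_i^n) \ge \sum_i (\nu_i^n - \hat\nu_i^n) \log q_i$ for all $\nu^n \in \Pi_n$; and ii) 
whenever $\tilde\nu^n$ has the property i) then $\sum_i (\hat\nu_i^n \log \hat\nu_i^n - \tilde\nu_i^n \log \tilde\nu_i^n) \ge \sum_i (\hat\nu_i^n - \tilde\nu_i^n) \log q_i$. 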

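Necessary and sufficient conditions for $\hat p$ to be an $I$-projection of $q$ on $\Pi$ are 
the following: I) $\sum_i (p_i \log p_i - \hat p_i \log \hat p_i) \ge \sum_i (p_i - \hat p_i) \log q_i$ for all $p \in \Pi$; and II) 
whenever $\tilde p$ has the property I) then $\sum_i (\hat p_i \log \hat p_i - \tilde p_i \log \tilde p_i) \ge \sum_i (\hat p_i - \tilde p_i) \log q_i$. 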

Comparison of i), ii) and I), II) then completes the proof. □ 
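The asymptotic coincidence of $\mu$- and $I$-projections can be illustrated numerically. The following is only a minimal sketch under assumptions that are not part of the text: a three-atom alphabet with values `xs = [1, 2, 3]`, the source `q = [0.5, 0.3, 0.2]`, and a convex feasible set $\Pi$ given by a mean constraint (mean at least 2.2). For such a set the $I$-projection is an exponentially tilted version of $q$, which is found here by bisection; the $\mu$-projection on $\Pi_n$ is found by exhaustive search over n-types. All function names are hypothetical.

```python
import math


def log_prob_of_type(counts, q):
    """log pi(nu^n; q): log-multiplicity of the type plus sum of n_i * log q_i."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, qi in zip(counts, q):
        out += c * math.log(qi) - math.lgamma(c + 1)
    return out


def mu_projection(n, q, xs, a_num, a_den):
    """Most probable n-type on three atoms whose mean of xs is >= a_num/a_den."""
    best, best_lp = None, -math.inf
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            counts = (n1, n2, n - n1 - n2)
            # Integer form of the mean constraint, to avoid float rounding.
            if a_den * sum(c * x for c, x in zip(counts, xs)) < a_num * n:
                continue
            lp = log_prob_of_type(counts, q)
            if lp > best_lp:
                best, best_lp = counts, lp
    return [c / n for c in best]


def i_projection(q, xs, a):
    """I-projection of q on {p : mean >= a}, assuming a exceeds the mean under q."""
    def tilted(lam):
        w = [qi * math.exp(lam * x) for qi, x in zip(q, xs)]
        s = sum(w)
        return [wi / s for wi in w]

    lo, hi = 0.0, 50.0
    for _ in range(200):  # bisection on the tilting parameter
        mid = (lo + hi) / 2
        mean = sum(p * x for p, x in zip(tilted(mid), xs))
        if mean < a:
            lo = mid
        else:
            hi = mid
    return tilted(hi)


if __name__ == "__main__":
    q, xs = [0.5, 0.3, 0.2], [1, 2, 3]
    p_hat = i_projection(q, xs, 2.2)
    for n in (20, 50, 100, 200):
        nu_hat = mu_projection(n, q, xs, 22, 10)
        # Sup-norm distance between the mu-projection and the I-projection.
        print(n, max(abs(a - b) for a, b in zip(nu_hat, p_hat)))
```

The printed distances are expected to shrink as n grows, although not necessarily monotonically, since the set of n-types is a lattice whose spacing depends on n.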



4 Note that if the i-th component $\nu_i^n$ of a type is zero, then it can be effectively omitted from 
the calculation of $\pi(\nu^n; q)$. Thus, it is assumed that the product operations at (1), (2) are performed 
on non-zero components only. 
11.2 ICET 



The conditional equi-concentration of types can be seen as a consequence of 
Sanov's Theorem and the MaxProb/MaxEnt Theorem. Indeed, Sanov's Theorem 
implies that the probability $\pi(\nu^n \in C; q)$ decays to zero for any open set $C$ 
which excludes all of the $I$-projections. The asymptotic identity of $I$- and 
$\mu$-projections shows that, for $n \to \infty$, the $I$-projections have the same value of the 
probability $\pi(\nu^n; q)$. 

The following is a rough attempt to make the argument a bit more formal. 
It relies upon the MaxProb/MaxEnt Thm and the Lemma, which states a standard 
inequality for the ratio of probabilities: 

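Lemma. Let $\nu^n$, $\hat\nu^n$ be two types from $\Pi_n$. Then 

$$\frac{\pi(\nu^n; q)}{\pi(\hat\nu^n; q)} < \left(\frac{n}{m}\right)^m \frac{\prod_i (q_i/\nu_i^n)^{n\nu_i^n}}{\prod_i (q_i/\hat\nu_i^n)^{n\hat\nu_i^n}}$$

Proof. $\pi(\nu^n; q) \le \prod_i (q_i/\nu_i^n)^{n\nu_i^n}$. Since for $n > 6$, $(n/e)^n < n! < n(n/e)^n$, it 
follows that $\pi(\hat\nu^n; q) > \frac{1}{\prod_i \hat n_i} \prod_i (q_i/\hat\nu_i^n)^{n\hat\nu_i^n}$, where $\hat n_i = n\hat\nu_i^n$. Finally, $\hat n_1 \cdots \hat n_m \le (n/m)^m$. □ 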
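The ratio bound can be checked numerically. The sketch below assumes the Lemma in the form $\pi(\nu^n;q)/\pi(\hat\nu^n;q) < (n/m)^m \prod_i (q_i/\nu_i^n)^{n\nu_i^n} / \prod_i (q_i/\hat\nu_i^n)^{n\hat\nu_i^n}$, and it restricts attention to types whose counts are all at least 7, because the factorial inequality used in the proof is stated for arguments greater than 6. The source `q` and the helper names are illustrative assumptions.

```python
import math


def log_prob_of_type(counts, q):
    """log pi(nu^n; q) for a type given by its occurrence counts."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, qi in zip(counts, q):
        out += c * math.log(qi) - math.lgamma(c + 1)
    return out


def log_ratio_bound(counts_nu, counts_hat, q):
    """log of (n/m)^m * prod (q_i/nu_i)^{n_i} / prod (q_i/nu_hat_i)^{n_hat_i}."""
    n, m = sum(counts_nu), len(q)

    def log_term(counts):
        # log prod_i (q_i / nu_i)^{n_i}, skipping zero components
        return sum(c * math.log(qi * n / c) for c, qi in zip(counts, q) if c > 0)

    return m * math.log(n / m) + log_term(counts_nu) - log_term(counts_hat)


def types_with_min_count(n, m, min_count):
    """Yield all count vectors of length m summing to n with entries >= min_count."""
    if m == 1:
        if n >= min_count:
            yield (n,)
        return
    for c in range(min_count, n - min_count * (m - 1) + 1):
        for rest in types_with_min_count(n - c, m - 1, min_count):
            yield (c,) + rest


if __name__ == "__main__":
    q = [0.5, 0.3, 0.2]
    types = list(types_with_min_count(30, 3, 7))
    violations = sum(
        log_prob_of_type(a, q) - log_prob_of_type(b, q) >= log_ratio_bound(a, b, q)
        for a in types
        for b in types
    )
    print(len(types), "types checked,", violations, "violations")
```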

ICET. Let $X$ be finite. Let there be $k$ proper $I$-projections $\hat p^1, \hat p^2, \dots, \hat p^k$ of $q$ on 
$\Pi$. Let $\epsilon > 0$ be such that for $j = 1, 2, \dots, k$, $\hat p^j$ is the only proper $I$-projection 
of $q$ on $\Pi$ in the ball $B(\epsilon, \hat p^j)$. Let $n \to \infty$. Then for $j = 1, 2, \dots, k$, 

$$\pi(\nu^n \in B(\epsilon, \hat p^j) \mid \nu^n \in \Pi; q) = 1/k.$$

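Proof. Clearly, 

$$\pi(\nu^n \in B(\epsilon, \hat p^j) \mid \nu^n \in \Pi; q) = \frac{\sum_{\nu^n \in B_n(\epsilon, \hat p^j)} \pi(\nu^n; q)}{\sum_{\nu^n \in \Pi_n} \pi(\nu^n; q)} \qquad (3)$$

where $B_n(\epsilon, \hat p^j) \triangleq B(\epsilon, \hat p^j) \cap \Pi_n$. 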

Without loss of generality, let there be a unique $I$-projection of $q$ on the 
ball $B_n(\epsilon, \hat p^j)$. (A sequence of the $I$-projections on $\Pi_n$ converges to a proper 
$I$-projection of $q$ on $\Pi$. To an $I$-projection on $\Pi$ which is not proper, no such 
sequence converges.) Also, without loss of generality, let there be $k$ 
$I$-projections $\hat p_n^j$, $j = 1, 2, \dots, k$, of $q$ on $\Pi_n$. 

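Let $A \triangleq B_n(\epsilon, \hat p^j) \setminus \{\hat p_n^j\}$, $B \triangleq \bigcup_{l \ne j} \{\hat p_n^l\}$, $C \triangleq \Pi_n \setminus (B \cup \{\hat p_n^j\})$. 
Then the Right-Hand Side of (3) can be rewritten as: 

$$\frac{1 + \sum_{\nu^n \in A} \frac{\pi(\nu^n)}{\pi(\hat p_n^j)}}{1 + \sum_{\nu^n \in B} \frac{\pi(\nu^n)}{\pi(\hat p_n^j)} + \sum_{\nu^n \in C} \frac{\pi(\nu^n)}{\pi(\hat p_n^j)}} \qquad (4)$$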

By the MaxProb/MaxEnt Thm, $I$-projections have, for $n \to \infty$, the same and 
supremal value of $\pi(\cdot)$. This implies that $\pi(\hat p_n^l)/\pi(\hat p_n^j)$ converges to 1 (the case 
of a 0/0 limit is excluded by the supremality of $\pi(\cdot)$). The same argument implies 
that the first ratio in the denominator converges to $k - 1$. The Lemma implies 
that the ratio in the numerator as well as the second ratio in the denominator 
converge to zero. □ 
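The equi-concentration can be observed on a small, exactly computable case. The sketch below is an illustration under assumptions not taken from the text: a uniform source on three atoms and the non-convex feasible set $\Pi = \{p : p_1 \ge 0.6 \text{ or } p_2 \ge 0.6\}$, for which the two $I$-projections are taken to be $(0.6, 0.2, 0.2)$ and $(0.2, 0.6, 0.2)$. Balls are measured in the sup-norm with $\epsilon = 0.15$, and the conditional probabilities are obtained by enumerating all n-types. Because this example is symmetric, the two probabilities coincide for every n; the point of interest is that each approaches 1/2, so that the mass outside the two balls vanishes.

```python
import math
from fractions import Fraction


def log_prob_of_type(counts, q):
    """log pi(nu^n; q) for a type given by its occurrence counts."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, qi in zip(counts, q):
        out += c * math.log(qi) - math.lgamma(c + 1)
    return out


def conditional_ball_probabilities(n, q, in_feasible_set, centers, eps):
    """pi(nu^n in B(eps, center_j) | nu^n in Pi_n) for each center, three atoms."""
    total = 0.0
    in_ball = [0.0] * len(centers)
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            counts = (n1, n2, n - n1 - n2)
            if not in_feasible_set(counts, n):
                continue
            weight = math.exp(log_prob_of_type(counts, q))
            total += weight
            for j, center in enumerate(centers):
                # Exact rational arithmetic for the sup-norm ball membership.
                if all(abs(Fraction(c, n) - ci) <= eps for c, ci in zip(counts, center)):
                    in_ball[j] += weight
    return [b / total for b in in_ball]


def feasible(counts, n):
    """Assumed feasible set: first or second relative frequency at least 0.6."""
    return 10 * counts[0] >= 6 * n or 10 * counts[1] >= 6 * n


if __name__ == "__main__":
    q = [1 / 3, 1 / 3, 1 / 3]
    centers = [
        (Fraction(3, 5), Fraction(1, 5), Fraction(1, 5)),
        (Fraction(1, 5), Fraction(3, 5), Fraction(1, 5)),
    ]
    for n in (30, 60, 120):
        print(n, conditional_ball_probabilities(n, q, feasible, centers, Fraction(3, 20)))
```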
11.3 Extended GCP 

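EGCP. Let $X$ be a finite set. Let $\Pi$ be such that it admits $k$ proper $I$-projections 
$\hat p^1, \hat p^2, \dots, \hat p^k$ of $q$ on $\Pi$. Then for a fixed $t$, 

$$\lim_{n \to \infty} \pi(X_1 = x_1, \dots, X_t = x_t \mid \nu^n \in \Pi; q \mapsto \nu^n) = \frac{1}{k} \sum_{j=1}^{k} \prod_{l=1}^{t} \hat p^j_{x_l}$$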

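Proof. Clearly, 

$$\pi(X_1 = x_1, \dots, X_t = x_t \mid \nu^n \in \Pi; q \mapsto \nu^n) = \frac{\sum_{\nu^n \in \Pi_n} \pi(X_1 = x_1, \dots, X_t = x_t, \nu^n)}{\sum_{\nu^n \in \Pi_n} \pi(\nu^n)} \qquad (5)$$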

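Let, in addition to the partitioning used in the proof of ICET, $D = \bigcup_{j=1}^{k} \{\hat p_n^j\}$. 
Then the RHS of (5) can be rewritten as: 

$$\frac{\sum_{\nu^n \in D} \pi(X_1 = x_1, \dots, X_t = x_t, \nu^n) + \sum_{\nu^n \in \Pi_n \setminus D} \pi(X_1 = x_1, \dots, X_t = x_t, \nu^n)}{\pi(\hat p_n^j) \left( 1 + \sum_{l \ne j} \frac{\pi(\hat p_n^l)}{\pi(\hat p_n^j)} + \sum_{\nu^n \in \Pi_n \setminus D} \frac{\pi(\nu^n)}{\pi(\hat p_n^j)} \right)} \qquad (6)$$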

The MaxProb/MaxEnt Thm implies that the first ratio in the denominator 
converges to $k - 1$. By the Lemma, the second ratio in the denominator of (6) 
converges to zero as $n$ goes to infinity. The second term in the numerator 
goes to zero as well as $n \to \infty$ (to see this, express the joint probability 
$\pi(X_1 = x_1, \dots, X_t = x_t, \nu^n)$ as $\pi(X_1 = x_1, \dots, X_t = x_t \mid \nu^n)\,\pi(\nu^n)$ and employ 
the Lemma). 

Then, the MaxProb/MaxEnt Thm implies that, for $n \to \infty$, the RHS of (6) 
becomes equal to $\frac{1}{k} \sum_{j=1}^{k} \pi(X_1 = x_1, \dots, X_t = x_t \mid \hat p^j)$. Finally, invoke Csiszar's 
'urn argument' (cf. [7]) to conclude that the asymptotic form of the RHS of (6) 
is $\frac{1}{k} \sum_{j=1}^{k} \prod_{l=1}^{t} \hat p^j_{x_l}$. □ 
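The limiting expression of EGCP is simple to evaluate, and for $t = 1$ it can be compared with an exact finite-n computation. The sketch below reuses an assumed illustrative setting (uniform source on three atoms, $\Pi = \{p : p_1 \ge 0.6 \text{ or } p_2 \ge 0.6\}$, with $I$-projections taken to be $(0.6, 0.2, 0.2)$ and $(0.2, 0.6, 0.2)$). It relies on exchangeability: given the type $\nu^n$, the probability that $X_1$ equals the i-th atom is $\nu_i^n$, so the conditional law of $X_1$ is the conditional expectation of the type. The function names are hypothetical.

```python
import math


def log_prob_of_type(counts, q):
    """log pi(nu^n; q) for a type given by its occurrence counts."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, qi in zip(counts, q):
        out += c * math.log(qi) - math.lgamma(c + 1)
    return out


def egcp_limit(projections, atom_indices):
    """(1/k) * sum_j prod_l p^j[x_l], the limiting form stated in EGCP."""
    k = len(projections)
    return sum(math.prod(p[i] for i in atom_indices) for p in projections) / k


def first_draw_given_feasible(n, q, in_feasible_set):
    """Exact law of X_1 given nu^n in Pi_n (three atoms), via exchangeability."""
    total = 0.0
    weighted = [0.0, 0.0, 0.0]
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            counts = (n1, n2, n - n1 - n2)
            if not in_feasible_set(counts, n):
                continue
            weight = math.exp(log_prob_of_type(counts, q))
            total += weight
            for i, c in enumerate(counts):
                weighted[i] += weight * c / n
    return [w / total for w in weighted]


def feasible(counts, n):
    """Assumed feasible set: first or second relative frequency at least 0.6."""
    return 10 * counts[0] >= 6 * n or 10 * counts[1] >= 6 * n


if __name__ == "__main__":
    projections = [(0.6, 0.2, 0.2), (0.2, 0.6, 0.2)]
    q = [1 / 3, 1 / 3, 1 / 3]
    print("limit:", [egcp_limit(projections, [i]) for i in range(3)])
    for n in (30, 60, 120):
        print(n, first_draw_given_feasible(n, q, feasible))
```

In this setting the limit is a mixture of the two projections with equal weights, not either projection alone, which is the point at which EGCP departs from the single-projection Gibbs Conditioning Principle.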

11.4 Rational I-projections 

Types can concentrate, in some sense, on a rational $I$-projection $\hat p \in \mathbb{Q}^m$ even 
though the $I$-projection is an isolated point of the set $\Pi$. The following Example 
illustrates the concentration. 

Example 5. Consider $\Pi = \{p, \hat p\}$ where $p = [n_1/n_0, \dots, n_m/n_0]$ and $\hat p = 
[\hat n_1/n_0, \dots, \hat n_m/n_0]$, $n_0 \in \mathbb{N}$. For $n \ne k n_0$, $k \in \mathbb{N}$, the set $\Pi_n$ is empty; otherwise 
it contains $p$ and $\hat p$. In this case, concentration of types on the $\mu$-projection is a 
direct consequence of the next two Lemmas. The $I$-variant of the concentration 
then arises from the MaxProb/MaxEnt Thm. 

Lemma 1. Let v n , v n be two n-types. Let 5 = v n — v n . Let K denote the 
non-negative elements of nS, L the absolute value of negative elements of n5. 
Let c = hf '/ Y[+ n f z > where the subscript — , + indicates that the index i 

goes through the elements of K, L, respectively. Then wj|psrj < c k , for any 
c k . □ 
Lemma 2. Let types $\nu^n$, $\hat\nu^n$ be such that $\pi(\hat\nu^n; q) < \pi(\nu^n; q)$. Then $\pi(\hat\nu^{kn}; q)/\pi(\nu^{kn}; q) \to 0$ 
as $k \to \infty$. 

Proof. By the assumption, [ji;"^"' < pfen ■ The gamma-ratio is, by the 
Lemma 1, smaller or equal to c, as defined at the Lemma. Thus, Jl?? 1- ™* = 
jc, where 7 S [0,1), 7 £ R. By Lemma 1, for any k G Z, ^^ t „ 1 < 

(l/ c ) fc II^^" ~" ^ The RHS of the inequality, 7 fc , goes for fc — > 00 to zero, 
which completes the proof. □ 

In this case, if $\Pi$ admits several rational $I$/$\mu$-projections, then, clearly, types 
equi-concentrate on them. 
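The concentration on a rational projection can be made concrete with a two-point feasible set. The sketch below assumes, purely for illustration, $q = (0.5, 0.3, 0.2)$, $n_0 = 10$, and $\Pi$ consisting of $\hat p = (4/10, 3/10, 3/10)$ and $p = (2/10, 3/10, 5/10)$. For $n = k n_0$ the set $\Pi_n$ then contains exactly these two types, and the conditional probability of the more probable one is $1/(1 + \pi(p; q)/\pi(\hat p; q))$. The helper names are hypothetical.

```python
import math


def log_prob_of_type(counts, q):
    """log pi(nu^n; q) for a type given by its occurrence counts."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for c, qi in zip(counts, q):
        out += c * math.log(qi) - math.lgamma(c + 1)
    return out


def prob_on_projection(k, counts_hat, counts_other, q):
    """pi(nu^n = p_hat | nu^n in Pi_n) for n = k * n0 when Pi = {p_hat, p}."""
    lp_hat = log_prob_of_type([k * c for c in counts_hat], q)
    lp_other = log_prob_of_type([k * c for c in counts_other], q)
    # Two-point conditional probability, written via the probability ratio.
    return 1.0 / (1.0 + math.exp(lp_other - lp_hat))


if __name__ == "__main__":
    q = [0.5, 0.3, 0.2]
    for k in range(1, 6):
        print(10 * k, prob_on_projection(k, (4, 3, 3), (2, 3, 5), q))
```

With several rational projections of equal probability in $\Pi$, the same computation would split the conditional mass evenly among them.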